Add budget-window batch economy endpoint #3387
The axes code path silently discarded all exceptions (`pass`), causing variables like NJ gross income to return null with no error trace. The code now logs the full traceback via `logging.exception()`.

Fixes #3322

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Codecov Report

❌ Patch coverage is

```diff
@@            Coverage Diff             @@
##           master    #3387      +/-   ##
==========================================
+ Coverage   72.20%   72.79%    +0.59%
==========================================
  Files          56       56
  Lines        2425     2827      +402
  Branches      423      514       +91
==========================================
+ Hits         1751     2058      +307
- Misses        595      649       +54
- Partials       79      120       +41
==========================================
```
I think it is dangerous for request-serving code to issue schema-management SQL. Separately, the new caching/coordination flow feels unnecessarily complex for what should be a standard cache: using DB rows plus advisory locks plus provisional execution IDs introduces a lot of race-prone state management. Please redo this around Redis instead of SQL locking, so we have a conventional cache / in-flight coordination mechanism rather than custom lock orchestration in request code.
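To make the Redis suggestion concrete, here is a minimal sketch of the conventional pattern: a canonical cache key derived from the request payload, plus an atomic "set if not exists with TTL" claim so only the first concurrent request starts the computation. The code below uses a small in-memory stand-in so it runs anywhere; real code would call redis-py's `r.set(key, job_id, nx=True, ex=ttl)`, which has the same semantics. All names here (`FakeRedis`, `claim_in_flight`, the key prefix) are illustrative, not part of the existing codebase.

```python
import hashlib
import json
import time

class FakeRedis:
    """In-memory stand-in for Redis SET NX EX semantics (illustrative only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        current = self._data.get(key)
        if current and current[1] is not None and current[1] < now:
            current = None  # treat expired entries as absent
        if nx and current is not None:
            return None  # NX: key already held, claim fails
        self._data[key] = (value, (now + ex) if ex else None)
        return True

def cache_key(payload: dict) -> str:
    """Canonical key: hash of the payload serialized with sorted keys,
    so semantically identical requests map to the same cache entry."""
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return "econ:" + hashlib.sha256(blob.encode()).hexdigest()

def claim_in_flight(r, payload: dict, job_id: str, ttl: int = 600) -> bool:
    """True if this request won the right to start the computation;
    False if another request already claimed this payload (poll its job)."""
    return bool(r.set(cache_key(payload) + ":inflight", job_id, nx=True, ex=ttl))

r = FakeRedis()
payload = {"reform": 1, "year": 2026}
assert claim_in_flight(r, payload, "job-A") is True   # first caller starts work
assert claim_in_flight(r, payload, "job-B") is False  # second caller polls job-A
```

The TTL doubles as stale-claim cleanup: if the worker dies, the claim simply expires, replacing the manual stale-claim handling in the current design.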
I think this should be reworked so the orchestration lives in the simulation API repo rather than in API v1.
Benefits:
- Much less request-time table editing and coordination logic in API v1. We would not need request-serving code to manage provisional rows, execution ID promotion, stale-claim cleanup, or schema-related DB behavior.
- Better scalability. The simulation API repo is already the natural boundary for spawning and tracking simulation work, and Modal can scale the worker side horizontally for us.
- Fewer race-condition risks. The current DB-lock/provisional-claim flow is a lot of custom concurrency machinery for what is essentially a batch execution problem.
- More leverage from Modal. We already get async job spawning, polling by job ID, worker isolation, and container scaling. We should use that instead of rebuilding orchestration in Flask request code.
- Easier eventual integration into API v2 alpha. If the batch orchestration lives in the simulation API repo, API v1 and API v2 alpha can both consume the same backend capability instead of duplicating orchestration logic in each API layer.
Implementation outline:
In the simulation API repo:
- Add a batch endpoint for this workflow, e.g. `/simulate/economy/comparison/batch` or `/simulate/economy/budget-window`.
- The request should accept either:
  - a base simulation payload plus `start_year`, `window_size`, and `max_parallel`, or
  - a fully expanded list of yearly payloads plus `max_parallel`.
- On submission, create one parent batch job and return a single batch job ID immediately.
- The batch job should fan out child yearly simulations as separate Modal executions, up to the requested parallelism limit.
- Track child job IDs and child statuses under the parent batch job.
- Expose polling for the parent batch job so the caller can retrieve:
- overall status
- progress
- completed years
- running years
- queued years
- failed years / error message
- final aggregated annual impacts and totals once complete
- The aggregation logic for the budget window should also live there, so the backend that owns the fan-out also owns the child-job status and final merged result.
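The parent/child shape described above can be sketched as follows. This is a local illustration only: it uses a thread pool as a stand-in for Modal's spawn-and-poll-by-job-ID workflow, and every name (`BatchJob`, `simulate_year`, the status fields) is hypothetical rather than an existing API.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

def simulate_year(year: int) -> dict:
    # Placeholder for one child yearly simulation (would be a Modal execution).
    return {"year": year, "budgetary_impact": 100.0 + year}

@dataclass
class BatchJob:
    """Parent batch job: fans out child yearly simulations and owns
    their statuses plus the final merged result."""
    years: list
    max_parallel: int
    results: dict = field(default_factory=dict)
    errors: dict = field(default_factory=dict)

    def run(self):
        # Fan out children, capped at the requested parallelism limit.
        with ThreadPoolExecutor(max_workers=self.max_parallel) as pool:
            futures = {pool.submit(simulate_year, y): y for y in self.years}
            for fut, year in futures.items():
                try:
                    self.results[year] = fut.result()
                except Exception as exc:
                    self.errors[year] = str(exc)

    def status(self) -> dict:
        # What the polling endpoint would expose for the parent job.
        done = sorted(self.results)
        return {
            "completed_years": done,
            "failed_years": sorted(self.errors),
            "progress": len(done) / len(self.years),
            # Aggregation lives with the fan-out, per the proposal above.
            "total_impact": sum(
                r["budgetary_impact"] for r in self.results.values()
            ),
        }

batch = BatchJob(years=list(range(2026, 2031)), max_parallel=3)
batch.run()
assert batch.status()["progress"] == 1.0
```

In the real service the parent would return its batch job ID immediately and the children would run asynchronously; the synchronous `run()` here just keeps the sketch self-contained.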
In API v1:
- Keep the budget-window route and request parsing.
- Keep the budget-window response contract, since this PR is already moving toward the right shape for the frontend.
- Replace the current per-year orchestration with a thin adapter:
- build one batch request
- submit it to the simulation API repo
- store/poll one parent batch job ID
  - map the batch status/result into the existing `BudgetWindowEconomicImpactResult` response
- Remove the request-time thread pool, DB advisory locks, provisional execution IDs, stale-claim handling, and per-year `reform_impact` coordination.
- If caching is still needed in API v1, use Redis as a standard cache:
- cache completed batch results by a canonical request key
- optionally cache “batch job currently in progress for this key”
- do not use SQL locking as the cache coordination mechanism
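The thin-adapter role for API v1 can be sketched in a few lines: build one batch request, submit it, poll the single parent job ID, and map the result into the existing response contract. `SimulationAPIClient` and its methods are stand-ins for the proposed simulation API endpoint, and the output fields are illustrative, not the actual `BudgetWindowEconomicImpactResult` schema.

```python
import time

class SimulationAPIClient:
    """Stub standing in for the proposed batch endpoint in the
    simulation API repo (hypothetical interface)."""

    def submit_batch(self, payload, start_year, window_size, max_parallel):
        self._years = list(range(start_year, start_year + window_size))
        return "batch-123"  # one parent batch job ID, returned immediately

    def poll(self, batch_id):
        # A real poll would report in-progress states; this stub completes
        # instantly so the sketch is runnable.
        return {
            "status": "complete",
            "annual_impacts": {y: 1.0 for y in self._years},
        }

def budget_window_impact(client, payload, start_year, window_size,
                         max_parallel=5, poll_interval=0.0):
    """Thin adapter: one submit, one poll loop, one mapping step."""
    batch_id = client.submit_batch(payload, start_year, window_size,
                                   max_parallel)
    while True:
        result = client.poll(batch_id)
        if result["status"] in ("complete", "error"):
            break
        time.sleep(poll_interval)
    # Map into the budget-window response shape (field names illustrative).
    return {
        "annual_impacts": result["annual_impacts"],
        "total": sum(result["annual_impacts"].values()),
    }

out = budget_window_impact(SimulationAPIClient(), {"reform": 1}, 2026, 5)
assert out["total"] == 5.0
```

Everything stateful (child statuses, retries, aggregation) stays behind the client, which is exactly what removes the thread pool and lock machinery from the request path.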
That would leave API v1 as a thin HTTP adapter and move the concurrency/orchestration concerns into the simulation API repo, where Modal is already doing the real execution work. It should be simpler to reason about, easier to scale, and substantially less race-condition-prone than the current lock-based design.
If this direction makes sense, I’d be happy to take it on in a follow-up. Flagging @MaxGhenis.
Companion app change: PolicyEngine/policyengine-app-v2#930